{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Manipulating pages"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "pikepdf presents the pages in a PDF through the ``Pdf.pages`` property, which\n",
    "follows the ``list`` protocol. As such page numbers begin at 0.\n",
    "\n",
    "Let's look at a simple PDF that contains four pages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pikepdf import Pdf\n",
    "\n",
    "pdf = Pdf.open('../../tests/resources/fourpages.pdf')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "How many pages?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "len(pdf.pages)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Thanks to IPython's rich Python object representations you can view the PDF while you work on it if you execute this IPython notebook. Click the *View PDF* link below to view the file. **You can view the PDF after change you make.** If you're reading this documentation online or as part of distribution, you won't see the rich representation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pdf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also examine individual pages, which we'll explore in the next section. Suffice to say that you can access pages by indexing them and slicing them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pdf.pages[-1].MediaBox"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Suppose the file was scanned backwards. We can easily reverse it in place - maybe it was scanned backwards, a common problem with automatic document scanners. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pdf.pages.reverse()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pdf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pretty nice, isn't it? Of course, the pages in this file are in correct order, so let's put them back."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pdf.pages.reverse()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Removing and adding pages is easy too."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "del pdf.pages[1:3]  # Remove pages 2-3 labeled \"second page\" and \"third page\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pdf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We've trimmed down the file to its essential first and last page. Now, let's add some content from another file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "appendix = Pdf.open('../../tests/resources/sandwich.pdf')\n",
    "pdf.pages.extend(appendix.pages)\n",
    "graph = Pdf.open('../../tests/resources/graph.pdf')\n",
    "pdf.pages.insert(1, graph.pages[0])\n",
    "pdf"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "Naturally, you can save your changes with ``.save(filename_or_stream)``. ``filename`` can be a :class:`pathlib.Path`, which we accept everywhere. (Saving is commented out to avoid upsetting the documentation generator.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# pdf.save('output.pdf')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using counting numbers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because PDF pages are usually numbered in counting numbers (1, 2, 3...), pikepdf\n",
    "provides a convenience accessor ``.p()`` that uses counting numbers:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pdf.pages.p(1)        # The first page in the document\n",
    "pdf.pages[0]          # Also the first page in the document\n",
    ";"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To avoid confusion, the ``.p()`` accessor does not accept Python slices, and ``.p(0)`` raises an exception.\n",
    "\n",
    "PDFs may define their own numbering scheme or different numberings for\n",
    "different sections. ``.pages`` does not look up this information."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. note::\n",
    "\n",
    "    Because of technical limitations in underlying libraries, pikepdf keeps the\n",
    "    original PDF from which a page from open, even if the reference to the PDF\n",
    "    is garbage collected."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. warning::\n",
    "\n",
    "    It's possible to obtain page information through the PDF ``/Root`` object as\n",
    "    well, but not recommend. The internal consistency of the various ``/Page``\n",
    "    and ``/Pages`` is not guaranteed when accessed in this manner, and in some\n",
    "    PDFs the data structure for these is fairly complex. Use the ``.pages``\n",
    "    interface."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Raw Cell Format",
  "kernelspec": {
   "display_name": "pikepdf",
   "language": "python",
   "name": "pikepdf"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
